
Pareto optimization #475

Open · wants to merge 47 commits into main
Conversation

AdrianSosic (Collaborator)

This PR finally brings Pareto optimization via the new ParetoObjective class, together with an example comparing Pareto vs. single-target optimization for a simple pair of synthetic targets.

Note: Support is currently limited to maximization and minimization targets. Match targets will follow but require a refactoring of the corresponding target transformation mechanism.

@AdrianSosic AdrianSosic added the new feature New functionality label Feb 3, 2025
@AdrianSosic AdrianSosic self-assigned this Feb 3, 2025
@AdrianSosic AdrianSosic force-pushed the feature/pareto branch 2 times, most recently from 385bebe to e633f92 Compare February 3, 2025 14:08
@Scienfitz Scienfitz (Collaborator) left a comment


first round of comments

@AdrianSosic AdrianSosic force-pushed the feature/pareto branch 5 times, most recently from 27ae08c to 711935f Compare February 13, 2025 15:22
AdrianSosic and others added 24 commits February 17, 2025 13:39
The ref_point is now in the original target space so that the user can
intuitively specify its coordinates. Sign flips for minimization targets
happen behind the scenes.
):
raise IncompatibleAcquisitionFunctionError(
f"You attempted to use a single-target acquisition function in a "
f"{n_targets}-target context."
Collaborator

The language here is not good due to the same issue as elsewhere: a single-target acqf is perfectly fine, e.g. for a desirability objective (i.e. multiple targets). What's really the problem is the number of outputs (and not targets).

Collaborator

detached comment 3: working with Pareto, I think the Campaign.posterior call is much more important and probably part of the workflow (to check the target predictions and possibly make a sub-choice of points to evaluate on the predicted frontier). Imo this should then be mentioned both in the UG and in the example (albeit briefly, just referencing the posterior method and explaining why it's useful for that).

Collaborator

Follow-up: for the API as well as for less experienced users, it might be useful to have a posterior method (or some other convenience object) that returns a dataframe with targets, mean, and variance of the posterior prediction, rather than a posterior object.

# Validate multi-target compatibility
if not self.supports_multi_target and (n_targets := len(objective.targets)) > 1:
raise IncompatibleSurrogateError(
f"You attempted to train a single-target surrogate in a "
Collaborator

Suggested change
f"You attempted to train a single-target surrogate in a "
f"You attempted to train a single-output surrogate in a "



@define
class BroadcastingSurrogate(SurrogateProtocol):
Collaborator

I was expecting a supports_multi_target=True in this class, but perhaps it's not needed or you had other reasons?

Collaborator Author

Yesterday I answered "yes, thanks, forgot" but I just realized that I omitted it on purpose, since it is actually not needed / fulfills no purpose here. The potential confusion: note that the two composite classes are not subclasses of Surrogate (which is where the flag is declared), they only implement the Protocol. Because the flag is only used inside Surrogate, there is no benefit in also exposing a copy of it in the composite ones – they will at some point dispatch to the actual surrogates, which is where the check is then performed (and if coming via this route, it will not even be required, since the number of encountered targets is then trivially 1). So it's only ever needed if you don't go via the composite surrogates. Makes sense? If yes, please resolve.
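The structural point can be sketched with plain stand-ins (illustrative names, not the actual BayBE classes; only the Protocol-vs-base-class mechanics are shown):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SurrogateProtocol(Protocol):
    """Stand-in protocol: any class with a fit() method satisfies it."""

    def fit(self, data: list[float]) -> None: ...


class Surrogate:
    """Concrete base class: the flag is declared and checked only here."""

    supports_multi_target: bool = False

    def fit(self, data: list[float]) -> None:
        pass  # single-target fitting would happen here


class BroadcastingSurrogate:
    """Implements the protocol only; it carries no flag because it
    dispatches to per-target Surrogate instances, where the check runs."""

    def fit(self, data: list[float]) -> None:
        pass  # would fan out to one Surrogate per target


# Both satisfy the protocol, but only the base class exposes the flag.
assert isinstance(BroadcastingSurrogate(), SurrogateProtocol)
assert hasattr(Surrogate(), "supports_multi_target")
assert not hasattr(BroadcastingSurrogate(), "supports_multi_target")
```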

Collaborator

I just thought having such a flag available outside of Surrogate could be useful; think e.g. about tests.

Comment on lines +354 to +358
The reference point is positioned in relation to the worst target configuration
within the provided array. The distance in each target dimension is adjusted by
a specified multiplication factor, which scales the reference point away from
the worst target configuration based on the maximum observed differences in
target values.
Collaborator

Suggested change
The reference point is positioned in relation to the worst target configuration
within the provided array. The distance in each target dimension is adjusted by
a specified multiplication factor, which scales the reference point away from
the worst target configuration based on the maximum observed differences in
target values.
The reference point is positioned relative to the worst point, in the direction coming from the best point. A factor of 0.0 would result in the reference point being the worst point, while a factor > 0.0 would move the reference point further away from both worst and best points. A factor of 1.0 would exactly mirror the best point onto the worst point.
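Assuming the factor semantics described in the suggestion, the per-dimension computation could be sketched as follows (hypothetical helper, not the actual implementation):

```python
def reference_point(worst: float, best: float, factor: float) -> float:
    """Position the reference point relative to the worst value, moving
    away from the best value: factor 0.0 yields the worst value itself,
    factor 1.0 mirrors the best value onto the far side of the worst."""
    return worst + factor * (worst - best)


# Maximization target (best=10 > worst=2): the reference moves below 2.
assert reference_point(2.0, 10.0, 0.0) == 2.0   # the worst point itself
assert reference_point(2.0, 10.0, 1.0) == -6.0  # mirror of best across worst
# Minimization target (best=1 < worst=5): the reference moves above 5.
assert reference_point(5.0, 1.0, 0.5) == 7.0
```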

plt.ylabel(y1.name)
plt.title("Target Space")
plt.axis("square")

Collaborator

can we mix target types in this example? ie one max and one min
reason: it's safer to test mixed targets, as both cases for the multiplier are then covered
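What such a mixed setup boils down to can be sketched in plain Python (not the example code; flipping the sign of the minimized target reduces dominance checks to pure maximization, mirroring the internal sign flip mentioned in the commit message):

```python
def dominates(a: tuple, b: tuple) -> bool:
    """True if a dominates b (coordinates already in max convention)."""
    return all(ai >= bi for ai, bi in zip(a, b)) and any(
        ai > bi for ai, bi in zip(a, b)
    )


def pareto_front(points: list[tuple]) -> list[tuple]:
    """Non-dominated subset of (y1, y2) pairs; y1 maximized, y2 minimized."""
    flipped = [(y1, -y2) for y1, y2 in points]  # minimize y2 -> maximize -y2
    return [
        p
        for p, f in zip(points, flipped)
        if not any(dominates(g, f) for g in flipped if g != f)
    ]


# (2.0, 3.0) dominates (1.0, 5.0) and (2.0, 4.0); (0.5, 1.0) survives
# because nothing beats its low y2.
front = pareto_front([(1.0, 5.0), (2.0, 3.0), (0.5, 1.0), (2.0, 4.0)])
```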


```{admonition} Non-dominated Configurations
:class: tip
A target configuration is considered non-dominated if no other configuration is better
Collaborator

for a newbie, otherwise it's probably not clear why the objective is even called Pareto

Suggested change
A target configuration is considered non-dominated if no other configuration is better
A target configuration is considered non-dominated (or Pareto-optimal) if no other configuration is better



@define
class CompositeSurrogate(SurrogateProtocol):
Collaborator

is there any merit to deriving BroadcastingSurrogate from CompositeSurrogate?
It seems like a special case to me

Collaborator Author

No, inheritance does not help here – the attribute layout and the way of calling the initializer are different. But I know what you had in mind: a BroadcastingSurrogate indeed behaves like a CompositeSurrogate with a flexible collection of single-task surrogates. So what you probably meant – and what would indeed work – is to compose the CompositeSurrogate in that case via an alternative construction route:

from collections import defaultdict
from copy import deepcopy

@classmethod
def from_template(cls, surrogate: SurrogateProtocol) -> "CompositeSurrogate":
    return cls(defaultdict(lambda: deepcopy(surrogate)))

With that, doing a BroadcastingSurrogate(template) would be identical to CompositeSurrogate.from_template(template).

In fact:

  • the amount of code required would be significantly less (i.e. one entire class dropped)
  • everything should still work via the API since we have our clever constructor flag
  • we could still use the auto-conversion for single-target surrogates

But the consequence is that the creation step would no longer be a regular "class call" but would require this separate route instead. So which do we prefer?
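The mechanics behind that construction route can be sketched with plain stand-ins (Model is illustrative; the real template would be a surrogate):

```python
from collections import defaultdict
from copy import deepcopy


class Model:
    """Illustrative stand-in for a single-target surrogate."""

    def __init__(self) -> None:
        self.fitted_on = None

    def fit(self, target: str) -> None:
        self.fitted_on = target


template = Model()
# Each target name lazily receives its own independent deep copy of the
# template, which is exactly the broadcasting behavior.
per_target = defaultdict(lambda: deepcopy(template))

per_target["yield"].fit("yield")
per_target["cost"].fit("cost")

assert per_target["yield"] is not per_target["cost"]  # independent models
assert per_target["yield"].fitted_on == "yield"
assert template.fitted_on is None  # the template itself is never touched
```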

Collaborator

But storing a sequence of n models (which is what Composite does, qualitatively) is just the more general case of creating such a sequence but populating it with n deepcopies of the same model (which is what Broadcasting does, qualitatively). Moreover, look at the fit methods: they are largely identical. These similarities suggest to me that we should at least not have two fully independent classes here...

I think this could be solved by inheriting the special case from the general case, by having a common intermediate class, or by your proposed solution. The latter is probably the most lightweight, and I would support it too.

We should still pair this with a converter that allows this simple use case: I create a Pareto objective but pass only a single-target model (no broadcasting, no wrapping, no other class used). Then we kind of have the best of both worlds?

And can we integrate somewhere the fact that these are then fully independent models? I think this is important to distinguish it from the hierarchical variant.

if not self.supports_multi_output and (n_targets := len(objective.targets)) > 1:
raise IncompatibleSurrogateError(
f"You attempted to train a single-target surrogate in a "
f"{n_targets}-target context. Either use a proper multi-target "
Collaborator

Suggested change
f"{n_targets}-target context. Either use a proper multi-target "
f"context requiring {n_targets} model outputs. Either use a proper multi-output "



@define
class BroadcastingSurrogate(SurrogateProtocol):
Collaborator

This is not mentioned anywhere in the surrogate user guide so far, but I think these are important explanations.

I'm not calling for a complete revamp, but at least mention that the models currently listed are all single-output (with the exception of the GP, where the built-in botorch mechanism is used).

And then mention the current ways of creating multi-output models via composite or broadcasting from the above.

Long term (not for this PR), I can imagine a section explaining (perhaps even with some flow charts) the four multi-output possibilities:

  1. botorch built-in broadcasting (computationally efficient, no partial measurements)
  2. broadcasting a single-target model, supports partial measurements
  3. composite, supports partial targets
  4. an inherently multi-output model not consisting of any independent models

Do I understand it right that after this PR we already have possibilities 1 to 3 (without the partial targets, of course)?

Labels
new feature New functionality